168 PART 4 Comparing Groups

» It’s flexible! The test works for tables with any number of rows and columns,

and it easily handles cell counts of any magnitude. Statistical software can

usually complete the calculations quickly, even on big data sets.

But the chi-square test has some shortcomings:

» It’s not an exact test. The p value it produces is only approximate, so using p < 0.05 as your criterion for statistical significance (meaning setting α = 0.05)

doesn’t necessarily guarantee that your Type I error rate will be only 5 percent. Remember, your Type I error rate is the probability that you will claim statistical significance for a difference that doesn’t really exist (see Chapter 3 for an introduction to Type I errors). The p value is quite accurate when all the cells in the table have large counts, but it becomes unreliable when one or more cell counts are very small (or zero). Recommendations differ as to the minimum count you need per cell in order to confidently use the chi-square test. A rule of thumb that many analysts use is that you should have at least five observations in each cell of your table (or better yet, an expected count of at least five in each cell).

» It’s not good at detecting trends. The chi-square test isn’t good at detecting

small but steady progressive trends across the successive categories of an

ordinal variable (see Chapter 4 if you’re not sure what ordinal is). It may give a

significant result if the trend is strong enough, but it’s not designed specifically

to work with ordinal categorical data. In those cases, you should use a

Mantel-Haenszel chi-square test for trend, which is outside the scope of this

book.
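To make the rule of thumb about small cells concrete, here is a minimal Python sketch that computes the expected count for every cell (row total times column total, divided by the grand total) and flags any cell that falls below five. The function names `expected_counts` and `small_cells` and the table itself are made up for illustration; they don't come from this book or from any particular statistical package.

```python
def expected_counts(table):
    """Return the expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def small_cells(table, minimum=5):
    """List (row, column, expected count) for each cell whose expected count is below `minimum`."""
    expected = expected_counts(table)
    return [
        (i, j, ex)
        for i, row in enumerate(expected)
        for j, ex in enumerate(row)
        if ex < minimum
    ]


# Hypothetical 2x2 table of observed counts (not data from this book).
observed = [[2, 10],
            [8, 20]]

flagged = small_cells(observed)
for i, j, ex in flagged:
    print(f"Cell ({i}, {j}) has an expected count of {ex:.1f}, which is below 5")
```

If the list of flagged cells is empty, the table passes this particular rule of thumb. If it isn't, the chi-square p value for that table deserves extra caution.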

Modifying the chi-square test: The Yates continuity correction

There is a little drama around the original Pearson chi-square of association test

that needs to be mentioned here. Yates, who was a contemporary of Pearson,

developed what is called the Yates continuity correction. Yates argued that in the

special case of the fourfold table, adding this correction results in more reliable

p values. The correction consists of subtracting 0.5 from the magnitude of the (Ob – Ex) difference before squaring it.
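The correction just described can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the output of any particular software: the function name `chi_square_2x2` is made up, and the table is hypothetical, chosen only so that every observed-minus-expected difference is 7.2 or –7.2 and the correction shrinks each one to 6.7. The p value uses the fact that, with 1 degree of freedom, the upper tail of the chi-square distribution equals erfc(√(χ²/2)).

```python
import math


def chi_square_2x2(table, yates=False):
    """Return (chi-square statistic, p value) for a fourfold table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            ex = row_totals[i] * col_totals[j] / n   # expected count
            diff = abs(table[i][j] - ex)             # magnitude of (Ob - Ex)
            if yates:
                # Subtract 0.5 before squaring; clipping at zero is an added
                # safeguard (an assumption here) for differences under 0.5.
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / ex
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical fourfold table (not taken from this book's figures).
table = [[18, 6],
         [9, 27]]

plain, p_plain = chi_square_2x2(table)
corrected, p_corrected = chi_square_2x2(table, yates=True)
print(f"Uncorrected: chi-square = {plain:.2f}, p = {p_plain:.5f}")
print(f"Yates:       chi-square = {corrected:.2f}, p = {p_corrected:.5f}")
```

Because the correction shrinks every difference before squaring, the corrected chi-square value is always a bit smaller, and the p value a bit larger, than the uncorrected one.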

Let’s apply the Yates continuity correction for your analysis of the sample data in

the earlier section “Understanding how the chi-square test works.” Take a look at

Figure 12-3, which has the differences between the values in the observed and

expected cells. The application of the Yates correction changes the 7.20 (or –7.20)

difference in each cell to 6.70 (or –6.70). This lowers the chi-square value from